136 research outputs found

Versatile Decision Trees for Learning Over Multiple Contexts

Author: J Alcalá-Fdez
J Demšar
JG Moreno-Torres
M Sugiyama
S Bickel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/08/2015

Crossref

Explore Bristol Research

A Genetic Tuning to Improve the Performance of Fuzzy Rule-Based Classification Systems with Interval-Valued Fuzzy Sets: Degree of Ignorance and Lateral Position

Author: A. Fernández
Akbarzadeh-Totonchi
Alcalá
Alcalá-Fdez
Antonelli
Bustince
Casillas
Cazarez-Castro
Celikyilmaz
Cordón
Cortes
Coupland
delaOssa
Demšar
Deschrijver
F. Herrera
Fernández
Gacto
García
Gorzalczany
H. Bustince
Herrera
Hidalgo
Ishibuchi
J. Sanz
Kaya
Liang
Luengo
Mansoori
Miller
Nojima
Palacios
Park
Sanz
Schaefer
Sheskin
Starczewski
Walker
Wu
Zarandi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

Fuzzy Rule-Based Systems are appropriate tools to deal with classification problems due to their good properties. However, they can suffer from a lack of accuracy as a result of the uncertainty inherent in the definition of the membership functions and the limitation of the homogeneous distribution of the linguistic labels. The aim of the paper is to improve the performance of Fuzzy Rule-Based Classification Systems by means of the Theory of Interval-Valued Fuzzy Sets and a post-processing genetic tuning step. In order to build the Interval-Valued Fuzzy Sets, we define a new function called weak ignorance for modeling the uncertainty associated with the definition of the membership functions. Next, we adapt the fuzzy partitions to the problem in an optimal way through a cooperative evolutionary tuning in which we handle both the degree of ignorance and the lateral position (based on the 2-tuples fuzzy linguistic representation) of the linguistic labels. The experimental study is carried out over a large collection of data sets and is supported by a statistical analysis. Our results show empirically that our methodology outperforms the initial Fuzzy Rule-Based Classification System. The application of our cooperative tuning enhances the results provided by the isolated tuning approaches and also improves the behavior of the genetic tuning based on the 3-tuples fuzzy linguistic representation.

Funding: Spanish Government TIN2008-06681-C06-01, TIN2010-1505

Elsevier - Publisher Connector

Crossref

Repositorio Institucional Universidad de Granada

Academica-e

A Sensitivity Analysis for Quality Measures of Quantitative Association Rules

Author: B. Alatas
D. Li
J. Alcalá-Fdez
M. Martínez-Ballesteros
R. Pears
V. Pachón Álvarez
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2013

Several fitness functions have been proposed that combine weighted objectives to optimize the discovery of association rules. Nevertheless, the measures used to assess the quality of the resulting association rules can differ according to the values of those weights. In such proposals, therefore, the user's choice of the weights or coefficients of the optimized objectives is very important. This work presents an analysis of the sensitivity of several quality measures when the weights included in the fitness function of the existing QARGA algorithm are modified. Finally, a comparative analysis of the results obtained according to the weight setup is provided.

Funding: MICYT TIN2011-28956-C02-00; Junta de Andalucía P11-TIC-752
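The weight sensitivity described in this abstract can be illustrated with a minimal sketch. It is not QARGA's fitness function, whose objectives the abstract does not list. The measures (support and confidence), the two rules, their values, and the function names are all hypothetical, chosen only to show how changing the weights can change which rule a weighted-sum fitness prefers.

```python
def weighted_fitness(measures, weights):
    """Weighted sum of a rule's quality measures (both dicts share the same keys)."""
    return sum(weights[name] * value for name, value in measures.items())


def best_rule(rules, weights):
    """Name of the rule with the highest weighted fitness."""
    return max(rules, key=lambda name: weighted_fitness(rules[name], weights))


# Hypothetical quality measures for two candidate rules (illustrative values only).
rules = {
    "rule_a": {"support": 0.60, "confidence": 0.70},
    "rule_b": {"support": 0.20, "confidence": 0.95},
}

# Two hypothetical weight setups: one favours support, the other confidence.
for weights in ({"support": 0.8, "confidence": 0.2},
                {"support": 0.1, "confidence": 0.9}):
    print(weights, "->", best_rule(rules, weights))
```

With the support-heavy weights the sketch prefers rule_a, and with the confidence-heavy weights it prefers rule_b, which is the kind of dependence on the weight setup that a sensitivity analysis examines.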

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

An ant colony-based semi-supervised approach for learning classification rules

Author: A Halder
AA Freitas
C Ginestet
C Hsu
D Angus
D Martens
F Otero
Fernando E. B. Otero
Gisele L. Pappa
I Triguero
J Alcalá-Fdez
J Wang
Julio Albinati
L Rokach
M Li
Samuel E. L. Oliveira
X Zhu
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2015

Semi-supervised learning methods create models from a few labeled instances and a great number of unlabeled instances. They are a good option in scenarios where there is a lot of unlabeled data and labeling instances is expensive, as is the case for most Web applications. This paper proposes a semi-supervised self-training algorithm called Ant-Labeler. Self-training algorithms take advantage of supervised learning algorithms to iteratively learn a model from the labeled instances and then use this model to classify unlabeled instances. The instances that receive labels with high confidence are moved from the unlabeled to the labeled set, and this process is repeated until a stopping criterion is met, such as labeling all unlabeled instances. Ant-Labeler uses an ACO algorithm as the supervised learning method in the self-training procedure to generate interpretable rule-based models, which are used as an ensemble to ensure accurate predictions. The pheromone matrix is reused across different executions of the ACO algorithm to avoid rebuilding the models from scratch every time the labeled set is updated. Results showed that the proposed algorithm obtains better predictive accuracy than three state-of-the-art algorithms in roughly half of the datasets on which it was tested, and the smaller the number of labeled instances, the better the Ant-Labeler performance.
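The generic self-training loop described in this abstract can be sketched as follows. This is not Ant-Labeler: a simple nearest-centroid classifier stands in for the ACO rule learner, there is no ensemble or pheromone reuse, and the confidence score, the `threshold` and `max_rounds` parameters, and all function names are assumptions made for illustration.

```python
from collections import Counter


def fit_centroids(X, y):
    """Stand-in supervised learner: one mean vector (centroid) per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); needs at least two classes.

    Confidence is d2 / (d1 + d2), where d1 and d2 are the distances to the
    nearest and second-nearest centroid, so it lies between 0.5 and 1.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
        for label, c in centroids.items()
    )
    (d1, label), d2 = dists[0], dists[1][0]
    conf = d2 / (d1 + d2) if d1 + d2 > 0 else 0.5
    return label, conf


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=20):
    """Move confidently labeled instances from the unlabeled pool to the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        if not pool:                      # stopping criterion: everything is labeled
            break
        model = fit_centroids(X_lab, y_lab)
        scored = [(x, *predict_with_confidence(model, x)) for x in pool]
        confident = [(x, lab) for x, lab, conf in scored if conf >= threshold]
        if not confident:                 # stopping criterion: no confident prediction
            break
        for x, lab in confident:
            X_lab.append(x)
            y_lab.append(lab)
        pool = [x for x, _, conf in scored if conf < threshold]
    return fit_centroids(X_lab, y_lab), pool


model, remaining = self_train(
    X_lab=[(0, 0), (10, 10)], y_lab=["a", "b"],
    X_unlab=[(1, 1), (9, 9), (2, 0), (8, 10), (5, 5)],
)
print(remaining)
```

In this toy run the four points near a labeled example are absorbed in the first round, while the point halfway between the two classes never reaches the confidence threshold and stays unlabeled.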

Crossref

Kent Academic Repository

Feature based multivariate data imputation

Author: A Petrozziello
B Frènay
C Valdiviezo
CJ Willmott
CK Enders
I Jordanov
J Alcalá-Fdez
J Bartlett
J Cohen
J Osborne
JW Graham
M Gòmez-Carracedo
MC Lee
O Troyanskaya
P Schmitt
PA Whigham
S Oba
T Chai
X-Y Pan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2019

Crossref

Portsmouth University Research Portal (Pure)

Ensemble and fuzzy techniques applied to imbalanced traffic congestion datasets: a comparative study

Author: A Jurek
C Seiffert
D Mokeddem
D Pescaru
E Bauer
F Harandi
H Finner
J Alcala-Fdez
J Alcalá-Fdez
J Cervantes
J Otero
JR Quinlan
K Savetratanakaree
L Breiman
L Guo
L Rokach
LA Zadeh
M Antonelli
M Galar
M Jesus Del
M Lango
MJ Jesus Del
NV Chawla
P Lim
P Lopez-Garcia
S García
S Holm
S Kotsiantis
S Nama
S Wang
SB Kotsiantis
V López
Y Fang
Y Freund
Z Xu
Z Zhao
Publication date: 11/05/2018

Class imbalance is among the most persistent complications that traditional supervised learning faces in real-world applications. Among the different kinds of classification problems studied in the literature, imbalanced ones, particularly those that represent real-world problems, have attracted the interest of many researchers in recent years. To address these problems, different approaches have been used or proposed in the literature, among them soft computing and ensemble techniques. In this work, ensembles and fuzzy techniques have been applied to real-world traffic datasets in order to study their performance in imbalanced real-world scenarios. The KEEL platform is used to carry out this study. The results show that different ensemble techniques obtain the best results on the proposed datasets. Document type: Part of book or chapter of boo

Crossref

Scipedia

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Instance selection of linear complexity for big data

Over recent decades, database sizes have grown considerably. Larger sizes present new challenges, because machine learning algorithms are not prepared to process such large volumes of information. Instance selection methods can alleviate this problem when the size of the data set is medium to large. However, even these methods face similar problems with very large-to-massive data sets. In this paper, two new algorithms with linear complexity for instance selection purposes are presented. Both algorithms use locality-sensitive hashing to find similarities between instances. While the complexity of conventional methods (usually quadratic, O(n^2), or log-linear, O(n log n)) means that they are unable to process large data sets, the new proposal shows competitive results in terms of accuracy. Even more remarkably, it shortens execution time, as the proposal manages to reduce complexity and make it linear with respect to the data set size. The new proposal has been compared with some of the best known instance selection methods for testing and has also been evaluated on large data sets (up to a million instances).

Supported by the Research Projects TIN 2011-24046 and TIN 2015-67534-P from the Spanish Ministry of Economy and Competitiveness
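To make the idea of linear-time instance selection with locality-sensitive hashing concrete, here is a minimal sketch. It is not either of the two algorithms from the paper, which the abstract does not specify. It assumes a random-hyperplane hash family and a "keep the first instance per bucket and class" rule; `make_hasher`, `select_instances`, `n_bits`, and `seed` are hypothetical names and parameters.

```python
import random


def make_hasher(dim, n_bits=4, seed=0):
    """Random-hyperplane LSH: each bit records which side of a hyperplane x falls on."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    def hasher(x):
        return tuple(
            sum(p * v for p, v in zip(plane, x)) >= 0 for plane in planes
        )

    return hasher


def select_instances(X, y, hasher):
    """Single pass over the data: keep the first instance of each (bucket, class) pair.

    Every instance is hashed once and checked against a set, so the cost
    grows linearly with the number of instances.
    """
    seen, kept = set(), []
    for i, (xi, yi) in enumerate(zip(X, y)):
        key = (hasher(xi), yi)
        if key not in seen:
            seen.add(key)
            kept.append(i)
    return kept


X = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0), (-1.0, -2.0), (-1.0, -2.0)]
y = [0, 0, 1, 1, 1]
kept = select_instances(X, y, make_hasher(dim=2))
print(kept)
```

Instances that fall in the same bucket and share a class are treated as redundant, so only one representative is kept. No pairwise distance computation is needed, which is where the quadratic cost of conventional methods usually comes from.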

Elsevier - Publisher Connector

Crossref

Repositorio Institucional de la Universidad de Burgos

An insight into imbalanced Big Data classification: outcomes and challenges

Author: A Fernández
A Thusoo
B Krawczyk
C Bunkhumpornpat
CP Chen
D Lyubimov
E Elsebakhi
E Ramentol
F Hu
G Haixiang
GEAPA Batista
GM Weiss
H He
H Yu
I Triguero
J Alcalá-Fdez
J Dean
J Huang
J Li
JA Sáez
JM Tomczak
K Kambatla
L Rokach
M Galar
M Wasikowski
NV Chawla
PC Zikopoulos
R Baeza-Yates
R Barandela
R Blagus
RC Prati
S Alshomrani
S Barua
S Elhag
S Kamal
S Owen
S Río
S-H Park
T Jo
T White
V García
V López
X Meng
X Wu
Y Guo
Y Sun
Y-S Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Big Data applications have been emerging in recent years, and researchers from many disciplines are aware of the significant advantages of extracting knowledge from this type of problem. However, traditional learning approaches cannot be directly applied due to scalability issues. To overcome this issue, the MapReduce framework has arisen as a “de facto” solution. Basically, it carries out a “divide-and-conquer” distributed procedure in a fault-tolerant way suited to commodity hardware. As this is still a recent discipline, little research has been conducted on imbalanced classification for Big Data. The reasons behind this are mainly the difficulties in adapting standard techniques to the MapReduce programming style. Additionally, intrinsic problems of imbalanced data, namely lack of data and small disjuncts, are accentuated when the data are partitioned to fit the MapReduce programming style. This paper rests on three main pillars. First, to present the first outcomes for imbalanced classification in Big Data problems, introducing the current research state of this area. Second, to analyze the behavior of standard pre-processing techniques in this particular framework. Finally, taking into account the experimental results obtained throughout this work, we will carry out a discussion on the challenges and future directions for the topic.

This work has been partially supported by the Spanish Ministry of Science and Technology under Projects TIN2014-57251-P and TIN2015-68454-R, the Andalusian Research Plan P11-TIC-7765, the Foundation BBVA Project 75/2016 BigDaPTOOLS, and the National Science Foundation (NSF) Grant IIS-1447795

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Repositorio Institucional Universidad de Granada

Regression conformal prediction with random forests

Author: A Lambrou
CE Rasmussen
D Devetyarov
GW Flake
H Papadopoulos
Henrik Boström
Henrik Linusson
J Alcalá-Fdez
JR Quinlan
K Nguyen
L Breiman
L Makili
LG Valiant
M Friedman
S García
Tuve Löfström
Ulf Johansson
Publication venue: 'Springer Science and Business Media LLC'

Crossref